Core microbiome-associated proteins associated with ulcerative colitis interact with cytokines for synergistic or antagonistic effects on gut bacteria

Abstract Inflammatory bowel disease (IBD), including Crohn’s disease (CD) and ulcerative colitis (UC), is associated with a loss or an imbalance of host–microorganism interactions. However, such interactions at protein levels remain largely unknown. Here, we applied a depletion-assisted metaproteomics approach to obtain in-depth host–microbiome association networks of IBD, where the core host proteins shifted from those maintaining mucosal homeostasis in controls to those involved in inflammation, proteolysis, and intestinal barrier in IBD. Microbial nodes such as short-chain fatty-acid producer-related host–microbial crosstalk were lost or suppressed by inflammatory proteins in IBD. Guided by protein–protein association networks, we employed proteomics and lipidomics to investigate the effects of UC-related core proteins S100A8, S100A9, and cytokines (IL-1β, IL-6, and TNF-α) on gut bacteria. These proteins suppressed purine nucleotide biosynthesis in stool-derived in vitro communities, which was also reduced in IBD stool samples. Single species study revealed that S100A8, S100A9, and cytokines can synergistically or antagonistically alter gut bacteria intracellular and secreted proteome, with combined S100A8 and S100A9 potently inhibiting beneficial Bifidobacterium adolescentis. Furthermore, these inflammatory proteins only altered the extracellular but not intracellular proteins of Ruminococcus gnavus. Generally, S100A8 induced more significant bacterial proteome changes than S100A9, IL-1β, IL-6, and TNF-α but gut bacteria degrade significantly more S100A8 than S100A9 in the presence of both proteins. Among the investigated species, distinct lipid alterations were only observed in Bacteroides vulgatus treated with combined S100A8, S100A9, and cytokines. These results provided a valuable resource of inflammatory protein-centric host–microbial molecular interactions.

Metaproteomics holds the promise to promote our understanding of how gut microorganisms interact with each other and with their host [7][8][9][10][11], but it suffers from insufficient sensitivity to identify low-abundance microbial species and proteins. Several strategies such as immunodepletion, enrichment, and fractionation have been employed to increase the depth of proteomics [12][13][14][15] or metaproteomics [8,9]. A recent study has revealed that abundant proteins in cells can be depleted by predigestion with an ultralow concentration of trypsin [16,17]. However, this strategy has not been evaluated in more complex biological samples such as stool.
In this study, we developed a trypsin predigestion-based deep metaproteomic approach, with which we found altered host-microbial protein association networks in IBD. Focusing on core inflammation-associated proteins, our study demonstrated that S100A8 and S100A9 can act on gut bacteria synergistically or antagonistically with the cytokines IL-1β and TNF-α.

Subjects and stool sample collections
A total of 89 subjects (26 CD patients, 29 UC patients, and 34 controls) were recruited from The Fifth Affiliated Hospital of Sun Yat-sen University (Table S1). IBD was diagnosed based on criteria regarding clinical symptoms, laboratory examination, imaging, endoscopy, and histopathology [18,19]. Control subjects met the following criteria: no gut mucosal inflammation, no infectious disease, and no antibiotic treatment within 4 weeks. Informed consent was obtained from each subject. Stool samples were collected on ice, transferred to the laboratory, and stored at −80 °C immediately. Before the sample collection, none of the patients had received any medical treatment. The severity of CD and UC was evaluated using the CDAI (Crohn's disease activity index) [20] and UCAI (ulcerative colitis activity index) [21], respectively. The IBD subtype was defined according to the Montreal classification [22]. This study was approved by the Ethics Committee of The Fifth Affiliated Hospital, Sun Yat-sen University (L037-1).

In-depth metaproteomic analysis of stool samples
Bacteria in fecal samples (∼300 mg) were enriched by differential centrifugation. After optimization, a protein:trypsin w:w ratio of 25 000:1 was employed to deplete dominant proteins [16,17]. The remaining proteins were further digested using a protein:trypsin w:w ratio of 50-100:1. Peptides were divided into three fractions using SDB-RPS and analyzed using an Orbitrap Fusion Lumos mass spectrometer (Thermo Scientific). PEAKS was employed for protein identification using our previously reported comprehensive database (130 975 891 sequences) comprising sequences from human, microbial, and dietary organisms [7]. A two-step search strategy was employed to increase the sensitivity of large database searching [7][8][9]. Proteins were quantified by label-free quantification based on peptide-spectrum matches (PSMs) (Supplemental methods).
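To make the two digestion steps concrete, the short sketch below converts the stated protein:trypsin w:w ratios into trypsin masses. The 100 µg protein input and the function name `trypsin_mass_ng` are assumptions chosen only for illustration; the actual protein amounts used are not given in this section.

```python
def trypsin_mass_ng(protein_mass_ug, protein_to_trypsin_ratio):
    """Trypsin mass (ng) for a protein mass (ug) at a protein:trypsin w:w ratio."""
    return protein_mass_ug * 1000.0 / protein_to_trypsin_ratio


# Hypothetical input of 100 ug protein (an assumption, not a value from the study).
protein_ug = 100.0

# Step 1: ultralow predigestion at 25 000:1 to deplete dominant proteins.
predigest_ng = trypsin_mass_ng(protein_ug, 25_000)

# Step 2: regular digestion of the remaining proteins at 50-100:1.
digest_low_ng = trypsin_mass_ng(protein_ug, 100)
digest_high_ng = trypsin_mass_ng(protein_ug, 50)

print(predigest_ng, digest_low_ng, digest_high_ng)
```

Under this assumed input, the predigestion step uses 250- to 500-fold less trypsin than the regular digestion step.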

Functional and microbial taxonomy analysis
Taxonomic and functional analyses of all identified peptides were performed with Unipept (http://unipept.ugent.be) [25], with I and L equated and advanced missed cleavage handling enabled. Gene Ontology (GO) terms of peptides were used as functional annotations. The relative abundance of functional groups was calculated by normalizing the number of corresponding peptides to the total number of peptides. The microbial taxonomic abundance was calculated by summing the intensities of the corresponding peptides. Proteins from microbiota were annotated using iMetaLab [26] and UniProt.
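The two abundance calculations described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the function names, the toy peptides, and the choice to count unannotated peptides in the denominator are all assumptions.

```python
from collections import defaultdict


def functional_relative_abundance(peptide_go_terms):
    """Fraction of all peptides annotated with each GO term.

    peptide_go_terms maps a peptide to the set of GO terms assigned to it
    (an empty set if unannotated). Assumption: unannotated peptides still
    count toward the total number of peptides.
    """
    total = len(peptide_go_terms)
    if total == 0:
        return {}
    counts = defaultdict(int)
    for terms in peptide_go_terms.values():
        for term in terms:
            counts[term] += 1
    return {term: n / total for term, n in counts.items()}


def taxonomic_abundance(peptide_taxon, peptide_intensity):
    """Sum peptide intensities for each assigned taxon."""
    totals = defaultdict(float)
    for peptide, taxon in peptide_taxon.items():
        totals[taxon] += peptide_intensity.get(peptide, 0.0)
    return dict(totals)


# Hypothetical toy data (peptide names and values are placeholders).
go_terms = {
    "PEP_A": {"purine nucleotide biosynthetic process"},
    "PEP_B": {"purine nucleotide biosynthetic process", "translation"},
    "PEP_C": set(),
    "PEP_D": {"translation"},
}
func = functional_relative_abundance(go_terms)
taxa = taxonomic_abundance(
    {"PEP_A": "Blautia", "PEP_B": "Blautia", "PEP_D": "Roseburia"},
    {"PEP_A": 10.0, "PEP_B": 5.0, "PEP_D": 2.5},
)
print(func)
print(taxa)
```

Note that a peptide carrying several GO terms contributes to each of them, so the functional fractions need not sum to one.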

Statistical analysis and data visualization
Protein abundances were normalized to the sum of each sample before further statistical analysis. Statistical significance among the three groups was assessed by the Kruskal-Wallis test for proteins present in at least 50% of samples, with missing values replaced by 0.2-fold of the minimum value using MetaboAnalyst [27]. MaAsLin2 was used to compute age-adjusted q values [28]. Taxonomy-function interactions were generated using iMetaLab. Partial least squares discriminant analysis (PLS-DA) and correlation analysis were performed using RStudio (4.4.2). The interaction networks were calculated by Spearman's rank correlation (false discovery rate (FDR) < 0.25) and visualized using Cytoscape. Heatmaps, Venn diagrams, and principal coordinate analysis (PCoA) were generated by ImageGP [29].
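The preprocessing steps above (sum normalization, the 50% prevalence filter, and replacement of missing values by 0.2-fold of the minimum) can be sketched in plain Python. This is an illustrative approximation and not the MetaboAnalyst implementation: the function names and toy values are hypothetical, and taking the minimum per protein rather than over the whole table is an assumption.

```python
def normalize_by_sample_sum(table):
    """Scale each sample so its protein values sum to 1.

    table: {sample: {protein: value}}; undetected proteins are absent.
    """
    out = {}
    for sample, values in table.items():
        total = sum(values.values())
        out[sample] = {p: v / total for p, v in values.items()} if total > 0 else {}
    return out


def proteins_by_prevalence(table, min_fraction=0.5):
    """Proteins detected in at least min_fraction of samples."""
    counts = {}
    for values in table.values():
        for protein in values:
            counts[protein] = counts.get(protein, 0) + 1
    return sorted(p for p, c in counts.items() if c / len(table) >= min_fraction)


def impute_fraction_of_min(table, proteins, fraction=0.2):
    """Fill missing values with fraction x the per-protein minimum (assumption)."""
    mins = {p: min(v[p] for v in table.values() if p in v) for p in proteins}
    return {
        sample: {p: values.get(p, fraction * mins[p]) for p in proteins}
        for sample, values in table.items()
    }


# Hypothetical toy table of raw abundances.
raw = {
    "s1": {"S100A8": 30.0, "MUC2": 70.0},
    "s2": {"S100A8": 10.0, "MUC2": 80.0, "APOD": 10.0},
    "s3": {"MUC2": 50.0, "S100A9": 50.0},
    "s4": {"S100A8": 20.0, "MUC2": 60.0, "APOD": 20.0},
}
normalized = normalize_by_sample_sum(raw)
kept = proteins_by_prevalence(normalized, min_fraction=0.5)
filled = impute_fraction_of_min(normalized, kept, fraction=0.2)
print(kept)
print(filled["s3"])
```

In this toy table, S100A9 is detected in only one of four samples and is dropped, while the missing S100A8 and APOD values are filled with one fifth of the smallest normalized value observed for that protein.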
Inflammation can reshape gut bacterial composition [42]. S100A9, which plays a prominent role in the regulation of inflammatory processes and immune responses, did not exhibit any significant correlation with gut bacteria in controls. However, S100A9 was positively associated with many potentially pathogenic bacteria in UC, such as Actinomycetes, Actinomycetota, and Micrococcales, and Micrococcales was related to UC severity (Fig. S8a). S100A9 in UC could also suppress beneficial gut bacteria in the Firmicutes phylum, based on its negative correlations with Roseburia (a butyric acid producer), Blautia (with potential probiotic properties), Coprobacillaceae, and Oscillatoriales.
In the control group, the probiotic Blautia wexlerae was positively associated with apolipoprotein D (APOD). However, this association was lost in IBD. Compared with other groups, a unique feature of the core bacteria in the UC network was Coprobacillaceae, which was positively associated with host chymotrypsin-C (CTRC), CELA3A, PRSS2, and KRT5 and negatively associated with inflammatory proteins, such as MPO and S100 family proteins (S100A8, S100A9, and S100A12), suggesting that this bacterial family may play a role in the pathogenesis of UC (Fig. 5A).

Disease-specific networks of host-microbial protein functional associations
Host-microbial functional association networks also exhibited a disease-specific pattern in terms of nodes and the links between nodes (positive or negative) (Fig. 5C and D, S7, and Table S5). For instance, homeostasis of branched-chain amino acids (BCAAs) (leucine, isoleucine, and valine) is essential for mammalian health. BCAAs cannot be synthesized endogenously by humans but can be produced by gut bacteria. In the control group, microbial biosynthesis of BCAAs was negatively associated with human proteoglycan 3 (PRG3) and myeloblastin (PRTN3) (Fig. 5D). However, microbial BCAA biosynthesis was positively associated with host proteases ectonucleotide pyrophosphatase/phosphodiesterase family member 7 (ENPP7), CELA3A, and keratin (KRT6A, KRT6B, and KRT5) in UC. In most cases, host inflammation-associated proteins S100A9 and S100A12 were negatively associated with the majority of gut microbial functions, such as monosaccharide metabolic and purine nucleotide biosynthetic processes, suggesting that inflammation can suppress the metabolism of the gut microbiome. Three important exceptions were threonine biosynthesis, intracellular iron ion homeostasis (positively related to CD), and ferric iron binding, which were positively correlated with S100A9 and/or S100A12 (Fig. 5C, S7a and b, and S8b, Table S5). Threonine is essential for intestinal mucin synthesis and the production of threonine-rich glycoproteins, which can protect the mucosal epithelium from injury. In an animal model, more threonine was required when inflammation existed in the intestine [38]. Another exception was that isomerase activity was negatively associated with many host proteins in both CD and UC [for example, S100A8, S100A12, MPO, serum amyloid P-component (APCS), and tissue factor (TF)], but this function was only positively associated with MUC2 in the control group. Additionally, intracellular iron ion homeostasis (positively related to CD severity) (Fig. S8b) exhibited a positive correlation with S100A8 and TF in UC, but was negatively associated with host proteins in controls.
Mucus layers are the frontline of host-gut bacteria interaction. Host-secreted gel-forming mucin MUC2 was positively correlated with translation, guanosine-5'-monophosphate (GMP) biosynthesis, and ribonucleotide binding in controls (Figs 5D and S7b), but there was no such association in UC (Figs 5C and S7a). These results suggest disturbed host-microbial functional interactions in the mucus layer in IBD.
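The association networks in this section rest on Spearman's rank correlation with an FDR cutoff of 0.25. A minimal, dependency-free sketch of that kind of edge selection is given here. It is not the RStudio workflow used in the study: the permutation-based p value is a swapped-in technique, Benjamini-Hochberg is assumed as the FDR procedure, and all function names and toy values are hypothetical.

```python
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p value (assumed, not the paper's test)."""
    rng = random.Random(seed)
    observed = abs(spearman_rho(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman_rho(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def build_edges(host, microbial, fdr_cutoff=0.25):
    """Host-microbial pairs (host, feature, rho, q) with q below the cutoff."""
    pairs = [(h, m) for h in host for m in microbial]
    rhos = [spearman_rho(host[h], microbial[m]) for h, m in pairs]
    q_values = benjamini_hochberg(
        [permutation_p_value(host[h], microbial[m]) for h, m in pairs]
    )
    return [(h, m, r, q) for (h, m), r, q in zip(pairs, rhos, q_values) if q < fdr_cutoff]


# Hypothetical toy abundances across eight samples.
host = {"S100A9": [1, 2, 3, 4, 5, 6, 7, 8]}
microbial = {
    "purine nucleotide biosynthesis": [8, 7, 6, 5, 4, 3, 2, 1],
    "other function": [3, 1, 4, 1, 5, 9, 2, 6],
}
edges = build_edges(host, microbial)
for edge in edges:
    print(edge)
```

The sign of rho gives the positive or negative link, and the number of retained edges per node corresponds to the node degree used to identify core proteins in the networks.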

In vitro gut microbiota treated with S100A8 and cytokines partially resemble inflammatory bowel disease stool metaproteomic alterations
We further investigated the influence of S100A8, a key host-microbial association protein in UC (Fig. 5A), on stool-derived in vitro gut microbial cultures derived from five healthy individuals. We also compared its effects with those of pro-inflammatory cytokines increased in IBD patients, including IL-1β, IL-6, and TNF-α. Protein concentrations were selected based on previous in vitro assays and their physiological concentrations in fecal samples [43][44][45]. Partial least squares discriminant analysis (PLS-DA) of metaproteomics data revealed that S100A8 (30 μg/ml) treatment induced the most significant alterations based on peptide and GO analysis compared to different concentrations of cytokines (Fig. 6A and B). However, taxonomy analysis indicated that IL-1β was more divergent than other groups in the PC1 dimension (Fig. 6C). Some microbial functional alterations of in vitro cultures induced by these inflammation-associated proteins (paired t-test, P < .1, Table S6) were also observed in IBD stool samples (FDR < 0.05, Figs 6D and 7E). All tested human proteins (S100A8, IL-1β, IL-6, and TNF-α) at diverse concentrations suppressed the microbial purine nucleotide biosynthesis process, which was also reduced in IBD stool samples (Figs 4A and 6D). In addition, protein transport by the Sec complex was inhibited by different cytokines (but not S100A8) and reduced in IBD stool samples. Some reverse trends were also observed in in vitro samples treated with S100A8 and cytokines compared with IBD stool samples: threonine synthase was increased in IBD and decreased in vitro (Fig. S6a and e).

Figure 3. Bacterial protein alterations of eleven main species in IBD and control groups. Significant changes were evaluated by MaAsLin2 (q < 0.25) to adjust age between two groups after Kruskal-Wallis test (FDR < 0.05). The colors in the heatmap represent the average of relative abundance in each group. * q < 0.05 versus control; ** q < 0.01 versus control; *** q < 0.001 versus control.

S100A8, S100A9, and cytokines synergistically or antagonistically alter gut bacteria intracellular proteome
To gain deeper insight into the effects of host inflammation-associated proteins (S100A8, S100A9, IL-1β, and TNF-α) on gut bacteria, we further investigated the proteome response of four representative species relevant to IBD: B. vulgatus, whose proteases are linked with UC disease severity [46]; Bifidobacterium adolescentis, an SCFA producer decreased in IBD [3]; Enterocloster bolteae; and Ruminococcus gnavus, which is associated with CD [47]. We evaluated the effect of host proteins on bacterial growth rate (Table S7). S100A8 + S100A9, S100A8 + S100A9 + IL-1β, and S100A8 + S100A9 + TNF-α reduced the total bacterial abundance of B. vulgatus (Fig. 7A), B. adolescentis (Fig. 7E), and R. gnavus (Fig. 7M). In E. bolteae, these combination treatments inhibited the growth rate rather than the final bacterial abundance (Fig. 7I). However, no such inhibitory effects were observed when these bacteria were treated with IL-1β, TNF-α, S100A8, or S100A9 alone. In proteomic analysis, similar to the above stool in vitro culture results, S100A8 induced the most distinct proteomic alterations compared to S100A9, IL-1β, and TNF-α (Fig. 7, Tables S6 and S8). S100A8 (but not the other three proteins) could be fully separated from control in the PC1 dimension in PLS-DA and produced many more differentially abundant proteins.

S100A8, S100A9, and cytokines synergistically or antagonistically alter gut bacteria secretome
We analyzed the bacterial secretory protein alterations (Fig. 8, Tables S11 and S12). PLS-DA scatter plots (Fig. 8A, D, F, and J) revealed a distinct separation between S100A8 and control in both PC1 and PC2 dimensions in B. vulgatus, B. adolescentis, and E. bolteae, with a large number of altered proteins (Fig. 8B, E, H, and K). S100A9 triggered secretory protein alterations in B. vulgatus and E. bolteae, whereas IL-1β only altered the R. gnavus secretome. In all cases, TNF-α did not significantly alter the secretome of the four studied gut bacteria and clustered with the control group in PLS-DA. For B. vulgatus, combining S100A9, TNF-α, or IL-1β with S100A8 reduced the number of significantly increased proteins from 137 with S100A8 alone to 29-38 in the combination groups, whereas it increased the number of significantly reduced proteins from 68 to 94-101 (Fig. 8B). This phenomenon was not observed in B. adolescentis, E. bolteae, or R. gnavus, where the proportions of increased or decreased proteins were comparable in different groups (Fig. 8E, H, and K).
Compared with the other three species, a unique alteration among E. bolteae secreted proteins was choline-binding protein, which exhibited a 6.6-fold reduction with S100A9 and a 38.1-40.0-fold reduction with S100A8, S100A8 + S100A9, S100A8 + S100A9 + IL-1β, and S100A8 + S100A9 + TNF-α. Choline, present in a variety of foods, is an essential nutrient for the host. Host-microbial cometabolism of choline plays an important role in host fitness, and certain metabolites such as trimethylamine N-oxide (TMAO) may raise the risk of cardiovascular diseases [48]. Our study indicated that E. bolteae may be involved in the metabolism of choline and that host inflammatory proteins can affect the microbial metabolism of choline by downregulating choline-binding protein.

Discussion
We demonstrated that ultralow trypsin predigestion is a straightforward approach that improves the coverage of low-abundance microbial proteins (especially membrane, cell wall, and extracellular proteins) while maintaining the overall relative abundance of most taxonomic groups in metaproteomics. This is probably because ultralow trypsin preferentially digests the highly abundant proteins in host protein-dominated fecal samples, and their depletion leaves more opportunity to sequence less abundant microbial proteins by mass spectrometry.
Network-centric analysis revealed disease-specific host-microbial protein associations in IBD. The core host proteins in the network shifted from those maintaining mucosal homeostasis (MUC2 and ASAG2) in controls to those involved in inflammation (S100A8, S100A9, and MPO), proteolysis, and intestinal barrier (KRT6A and KRT6B) in IBD. Many functionally important microbial nodes in the network were largely lost or suppressed in IBD. Our network analysis suggested that inflammatory proteins can promote potentially pathogenic bacteria and suppress beneficial gut bacteria in UC. Restoring those key microbial functions and suppressing disease-specific bacteria may contribute to treating IBD. However, our peptide-centric taxonomic and functional analysis may suffer from information loss and misassignments because peptides are generally short and thus hard to assign to specific proteins. Since the microbial sequences in the database used are not specific to the samples analyzed, the accuracy of the peptides and proteins identified may be affected. The two-step searching approach could also underestimate FDR. Identification confidence could be improved by combining results from multiple software tools.
In accordance with our finding that S100A8 and S100A9 were among the core microbiome-associated host proteins, fecal calprotectin (S100A8/A9) is both sensitive and specific for surveillance of IBD disease activity and histological severity [49]. In addition, pro-inflammatory cytokines (IL-1β, IL-6, and TNF-α) have been reported to bind bacteria and to affect the growth rate and the expression of certain genes of the Bifidobacterium longum GT15 strain [50][51][52]. However, the detailed molecular effects of these host inflammatory proteins on gut microorganisms have not been systematically studied by multi-omics approaches. The present metaproteomics study demonstrated that purine nucleotide biosynthesis was suppressed in both IBD stools and in vitro stool microbial cultures (derived from healthy subjects) treated with S100A8, IL-1β, IL-6, and TNF-α. Host S100A8/A9 has been reported to induce bacterial metal starvation through chelation of nutrients (zinc and manganese) [53], and can result in robust transcriptional alterations in E. coli [54]. Accordingly, our present study revealed that the secretory protein of B. vulgatus most downregulated by these host proteins was the heme-binding protein FetB, which is involved in heme capture.
Our study of single gut bacteria revealed that S100A8 + S100A9, S100A8 + S100A9 + IL-1β, and S100A8 + S100A9 + TNF-α suppressed a large number of biological processes in the intracellular and extracellular proteome of beneficial B. adolescentis, indicating that these human inflammatory proteins may be involved in the inhibition of probiotic bacteria. IL-1β, S100A8 + S100A9, S100A8 + S100A9 + IL-1β, and S100A8 + S100A9 + TNF-α altered a large number of R. gnavus secretory proteins, but not intracellular proteins. S100A8/A9 accounts for ∼50% of the neutrophil cytoplasmic content and is also produced by monocytes/macrophages and potentially epithelial cells [55], and cytokines (IL-1β, IL-6, and TNF-α) can be released by monocytes/macrophages after stimulation by bacteria and endotoxin. Furthermore, S100A8 can bind and capture cytokines in a noncovalent manner and inhibit the generation of IL-6 and TNF-α by monocytes [56,57]. Different host cells can thus exert synergistic and antagonistic regulatory effects on gut bacteria through these secreted proteins. Overall, our study reveals complex interactions between host inflammatory proteins and gut bacteria.

Figure 1. Deep metaproteomics using ultralow trypsin concentration digestion and ex vivo microorganism culture for host-microbial interaction analysis. (A) The workflow of depletion-assisted deep metaproteomics and ex vivo culture experiments. Microbial proteins were digested with ultralow concentration trypsin and filtered to remove high-abundance peptides. The remaining proteins were subjected to normal trypsin digestion and off-line fractionation for downstream metaproteomic analysis. The ex vivo microorganism culture models were incubated with inflammatory proteins to evaluate the host-microbial interaction. (B-M) The performance of the ultralow trypsin concentration digestion method. Total peptide (B) and protein (C) numbers in samples digested with normal method and gradient ultralow trypsin concentration treatments in two samples. Total peptide-spectrum match (PSM) numbers (D), MS/MS (MS2) numbers (E), and PSMs/MS2 ratios (F) in different treatment groups. The percentages of bacterial protein intensity (G) and protein number (H). The relative abundance of bacterial taxonomy identified by peptides at the genus (I) and species level (J). (K) Comparison of bacterial cellular component enrichment based on peptides in ultralow and control groups. (L) Plots of the fold change of proteins (ultralow 1:25 000 to control groups, n = 2) versus the relative abundance of proteins in control. Data points are plotted on the basis of protein peak area from two biological replicates. (M) Relative abundance changes of the top five proteins from ultralow 1:25 000 and control group in sample 1 (P = .041) and sample 2 (P = .008), two-sided paired t-test.

Figure 2. Overview of host and microbial proteome profiling in IBD patients. (A) Total peptide number identified in all samples and the percentage of different organisms in metaproteomics. (B) Protein numbers quantified in human, microbiome, and dietary organisms. (C) Percentage of protein intensity distribution. (D) Boxplot of human to bacteria ratio (log2-transformed) based on protein number. * P < .05, ** P < .01, *** P < .001, Kruskal-Wallis test with post hoc Dunn-Bonferroni analysis. (E) Relative abundance of fungi to bacteria (log2-transformed). * P < .05, ** P < .01, *** P < .001, Kruskal-Wallis test with post hoc Dunn-Bonferroni analysis. (F) Pearson's rank correlation coefficient r of fungi abundance and disease severity scores. CDAI scores for CD and UCAI scores for UC were used to determine disease severity. (G) Discriminant fungal proteins between IBD and control (Kruskal-Wallis test (FDR < 0.05) and MaAsLin2 to adjust age between two groups, * q < 0.05, ** q < 0.01, *** q < 0.001). Principal coordinates analysis (PCoA) of host proteins (H) as well as microbial proteins (I) based on Bray-Curtis distance. Ellipses represent a 95% confidence interval for each group. Statistics were calculated using pairwise PERMANOVA analyses with the function adonis from the vegan package.

Figure 4. Microbial function alterations in IBD patients. Normalized relative abundance (log2-transformed) of altered biological processes (A) and molecular functions (B). Kruskal-Wallis test was employed to evaluate statistical difference among three groups (control, CD, and UC). Functions with FDR < 0.05 are shown. The q value between two groups was calculated by MaAsLin2 to adjust age. Box plots indicate the first (bottom line), median (central line), and third (top line) quartiles of the data. Samples are shown as dots (missing values are not included). Outliers >1.5 times the interquartile range (IQR) are indicated as dots. The left of the dotted line represents decreased functions and the right represents increased functions. * q < 0.05, ** q < 0.01, *** q < 0.001.

Figure 5. Co-occurrence networks of host proteins and microbiome that were differentially expressed in UC and control.Correlations of differential host proteins and altered gut microbial composition in UC (A) and control (B) (FDR < 0.25).Correlations of differential host proteins and altered microbial biological processes in UC (C) and control (D) (FDR < 0.25).The correlations in networks are calculated by Spearman's rank correlation (FDR < 0.25).The circle indicates human proteins, and the square indicates microbiome.The size and color of nodes are proportional to the connection number (degree) and q value, respectively.The host protein nodes in control are shown in the same color.The edge color is proportional to the Spearman's rank correlation.